A Survey of Bitmap Index-Compression Algorithms for Big Data

Authors

  • Zhen Chen
  • Yuhao Wen
  • Wenxun Zheng
  • Jiahui Chang
  • Guodong Peng
  • Yinjun Wu
  • Ge Ma
  • Mourad Hakmaoui
  • Junwei Cao
Abstract

With the growing popularity of Internet applications and the widespread use of mobile Internet, Internet traffic has grown rapidly over the past two decades. Internet Traffic Archival Systems (ITAS) for packets or flow records are increasingly used in network monitoring, network troubleshooting, and the analysis of user behavior and experience. In this paper, we survey bitmap-index compression algorithms for traffic archival systems. The current state-of-the-art bitmap-index encoding schemes include BBC, WAH, PLWAH, EWAH, PWAH, CONCISE, and COMPAX. Based on differences in segmentation, chunking, merge compress, and Near Identical (NI) features, we provide a thorough categorization of these state-of-the-art bitmap compression algorithms. We also propose several new bitmap encoding algorithms (SECOMPAX, ICX, MASC, and PLWAH+) and show the state diagrams for their encoding algorithms. We then evaluate their CPU and GPU implementations with a real Internet trace from CAIDA. Finally, we summarize and discuss future directions for bitmap-index compression algorithms.
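As a rough illustration of what the word-aligned schemes named in the abstract do, the following is a minimal sketch in the spirit of WAH with 32-bit words: the bitmap is cut into 31-bit groups, groups that are all zeros or all ones are merged into run-length "fill" words, and every other group is stored verbatim as a "literal" word. The function names `wah_encode` and `wah_decode` are hypothetical, and padding the tail with zeros is a simplifying assumption; this is not the implementation evaluated in the paper.

```python
GROUP_BITS = 31                    # payload bits per 32-bit word
ALL_ONES = (1 << GROUP_BITS) - 1   # a group consisting only of 1 bits
MAX_RUN = (1 << 30) - 1            # the run counter occupies the low 30 bits


def wah_encode(bits):
    """Encode a sequence of 0/1 values into a list of 32-bit WAH-style words."""
    # Simplifying assumption: pad the tail with zeros to a whole number of groups.
    padded = list(bits) + [0] * (-len(bits) % GROUP_BITS)
    words = []
    for start in range(0, len(padded), GROUP_BITS):
        group = 0
        for b in padded[start:start + GROUP_BITS]:
            group = (group << 1) | b
        if group in (0, ALL_ONES):
            # Fill word layout: bit 31 = 1, bit 30 = fill value, low 30 bits = run length.
            fill_bit = 1 if group == ALL_ONES else 0
            header = (1 << 31) | (fill_bit << 30)
            same_fill = bool(words) and (words[-1] >> 30) == (header >> 30)
            if same_fill and (words[-1] & MAX_RUN) < MAX_RUN:
                words[-1] += 1            # extend the previous run by one group
            else:
                words.append(header | 1)  # start a new run of length 1
        else:
            words.append(group)           # literal word: bit 31 stays 0
    return words


def wah_decode(words, n_bits):
    """Expand WAH-style words back into the first n_bits of the bitmap."""
    bits = []
    for w in words:
        if w >> 31:  # fill word
            fill_bit = (w >> 30) & 1
            bits.extend([fill_bit] * (GROUP_BITS * (w & MAX_RUN)))
        else:        # literal word, most significant payload bit first
            bits.extend((w >> i) & 1 for i in range(GROUP_BITS - 1, -1, -1))
    return bits[:n_bits]
```

Because the zero padding is not recorded in the encoded words, the caller has to pass the original bitmap length to `wah_decode`. Sparse bitmaps with long runs of identical bits collapse into a few fill words, which is the property the surveyed schemes refine in different ways.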


Similar resources

Improving Bitmap Index Compression by Data Reorganization

The volume of data generated by scientific applications through observations or computer simulations can reach the order of petabytes. This creates a need for effective and compact indexing methods for efficient storage and retrieval of scientific data. Bitmap indexing has been successfully applied in this domain by exploiting the fact that scientific data are mostly read-only and en...


Bitmap Indices for Speeding Up High-Dimensional Data Analysis

Bitmap indices have gained wide acceptance in data warehouse applications and are an efficient access method for querying large amounts of read-only data. The main trend in bitmap index research focuses on typical business applications based on discrete attribute values. However, scientific data that is mostly characterised by non-discrete attributes cannot be queried efficiently by currently s...


Data Compression for Bitmap Indexes

Compression Ratio (CR) and Logical Operation Time (LOT) are two major measures of the efficiency of bitmap indexing. Previous works by [5, 9, 10, 11] compare the performance of bitmap compression schemes conducted separately on logical operation time and compression ratio. This paper will describe these works and recommend for consideration a new matrix – overall efficiency indicator. The overa...


Bitmap Indices for Data Warehouses

In this chapter we discuss various bitmap index technologies for efficient query processing in data warehousing applications. We review the existing literature and organize the technology into three categories, namely bitmap encoding, compression and binning. We introduce an efficient bitmap compression algorithm and examine the space and time complexity of the compressed bitmap index on large ...


Genetic Algorithms and Cellular Automata: unraveling the Bitmap Problem

Using Genetic Algorithms to evolve Cellular Automata rules to solve a given problem is a well-known method. The Bitmap Problem, however, with its versatile and challenging characteristics, remains quite unknown. This thesis focuses on the Bitmap Problem and tries to expose its inner workings, possibilities and pitfalls. Multiple aspects of the Bitmap Problem like grid size, state set and updati...




Publication date: 2014